Rejection Sampling Variational Inference

Authors

  • Christian A. Naesseth
  • Francisco J. R. Ruiz
  • Scott W. Linderman
  • David M. Blei
Abstract

Variational inference using the reparameterization trick has enabled large-scale approximate Bayesian inference in complex probabilistic models, leveraging stochastic optimization to sidestep intractable expectations. The reparameterization trick is applicable when we can simulate a random variable by applying a (differentiable) deterministic function to an auxiliary random variable whose distribution is fixed. For many distributions of interest (such as the gamma or Dirichlet), simulation of random variables relies on rejection sampling. The discontinuity introduced by the accept–reject step means that standard reparameterization tricks are not applicable. We propose a new method that lets us leverage reparameterization gradients even when variables are outputs of a rejection sampling algorithm. Our approach enables reparameterization on a larger class of variational distributions. In several studies of real and synthetic data, we show that the variance of our gradient estimator is significantly lower than that of other state-of-the-art methods, leading to faster convergence of stochastic optimization for variational inference.

Let p(x, z) be a probabilistic model, i.e., a joint probability distribution of data x and latent (unobserved) variables z. In Bayesian inference, we are interested in the posterior distribution

$$p(z \mid x) = \frac{p(x, z)}{p(x)}.$$

For most models, the posterior distribution is analytically intractable and we have to use an approximation, such as Monte Carlo methods or variational inference. In this paper, we focus on variational inference.

In variational inference, we approximate the posterior with a variational family of distributions q(z ; θ), parameterized by θ. Typically, we choose the variational parameters θ that minimize the Kullback–Leibler (KL) divergence between q(z ; θ) and p(z|x). This minimization is equivalent to maximizing the evidence lower bound (ELBO) [Jordan et al., 1999],

$$\mathcal{L}(\theta) = \mathbb{E}_{q(z;\theta)}\left[f(z)\right] + H[q(z;\theta)], \qquad f(z) := \log p(x, z), \qquad H[q(z;\theta)] := \mathbb{E}_{q(z;\theta)}\left[-\log q(z;\theta)\right]. \tag{1}$$

When the model and variational family satisfy conjugacy requirements, we can use coordinate ascent to find a local optimum of the ELBO [Ghahramani and Beal, 2001, Blei et al., 2016]. If the conjugacy requirements are not satisfied, a common approach is to build a Monte Carlo estimator of the gradient of the ELBO [Paisley et al., 2012, Ranganath et al., 2014, Salimans and Knowles, 2013]. Empirically, the reparameterization trick has been shown to be beneficial over direct Monte Carlo estimation of the gradient using the score function estimator [Rezende et al., 2014, Kingma and Welling, 2014, Titsias and Lázaro-Gredilla, 2014, Fan et al., 2015]. However, it is not generally applicable; it requires that (i) the latent variables z are continuous, and (ii) we can simulate from q(z ; θ) as follows,

$$z = h(\varepsilon, \theta), \quad \text{with } \varepsilon \sim s(\varepsilon). \tag{2}$$

Here, s(ε) is a distribution that does not depend on the variational parameters; it is typically a standard normal or a standard uniform. Further, h(ε, θ) must be differentiable with respect to θ. Using (2), we can move the derivative inside the expectation and rewrite the gradient of the ELBO as

$$\nabla_\theta \mathcal{L}(\theta) = \mathbb{E}_{s(\varepsilon)}\left[\nabla_z f(h(\varepsilon, \theta))\, \nabla_\theta h(\varepsilon, \theta)\right] + \nabla_\theta H[q(z;\theta)].$$
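To make the requirement in (2) concrete, the following is a minimal sketch, in Python/NumPy (our choice of language, not the paper's), of the reparameterization gradient for a Gaussian variational family, where h(ε, θ) = μ + σε and s(ε) is a standard normal. The toy log-joint f(z) and all function names are illustrative assumptions, not code from the paper.

```python
# Sketch: single-sample reparameterization gradient (eq. 2) for a Gaussian
# variational family q(z; theta) = N(mu, sigma^2), i.e.
# h(eps, (mu, sigma)) = mu + sigma * eps with eps ~ s(eps) = N(0, 1).
import numpy as np

def grad_f(z):
    # Gradient of a toy log-joint f(z) = log p(x, z) = -0.5 z^2 + sin(z);
    # this target is purely illustrative.
    return -z + np.cos(z)

def reparam_gradient(mu, sigma, n_samples=1000):
    """Monte Carlo estimate of grad_theta E_{q(z;theta)}[f(z)]."""
    rng = np.random.default_rng()
    eps = rng.standard_normal(n_samples)   # eps ~ s(eps), free of theta
    z = mu + sigma * eps                   # z = h(eps, theta)
    g = grad_f(z)                          # nabla_z f evaluated at h(eps, theta)
    # Chain rule: dh/dmu = 1 and dh/dsigma = eps.
    return g.mean(), (g * eps).mean()

grad_mu, grad_sigma = reparam_gradient(mu=0.0, sigma=1.0)
# The full ELBO gradient adds grad_theta H[q(z; theta)]; for a Gaussian the
# entropy is available in closed form, H = 0.5 * log(2 * pi * e * sigma**2).
```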
Algorithm 1 Reparameterized Rejection Sampling

Input: target q(z ; θ), proposal r(z ; θ), and constant M_θ, with q(z ; θ) ≤ M_θ r(z ; θ)
Output: ε such that h(ε, θ) ∼ q(z ; θ)

1: i ← 0
2: repeat
3:   i ← i + 1
4:   Propose ε_i ∼ s(ε)
5:   Simulate u_i ∼ U[0, 1]
6: until u_i < q(h(ε_i, θ) ; θ) / (M_θ r(h(ε_i, θ) ; θ))
7: return ε_i

[Figure omitted: plot with horizontal axis ε and a logarithmic vertical axis; only the axis ticks survived extraction.]
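Algorithm 1 can be instantiated for the gamma distribution, one of the cases named in the abstract: its standard simulator (Marsaglia and Tsang, 2000) is a rejection sampler with h(ε, a) = (a − 1/3)(1 + ε/√(9a − 3))³ for shape a and ε ∼ N(0, 1). The sketch below is an illustration in Python/NumPy assuming shape a ≥ 1; the acceptance test is the standard Marsaglia–Tsang rewriting of u_i < q(h(ε_i, θ) ; θ) / (M_θ r(h(ε_i, θ) ; θ)), and all names are ours rather than the authors' code.

```python
# Sketch: Algorithm 1 for Gamma(a, 1) via the Marsaglia-Tsang rejection
# sampler. Assumes shape a >= 1 (smaller shapes need a boosting step).
import numpy as np

def h(eps, a):
    """Deterministic reparameterization transform for the gamma."""
    d = a - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    return d * (1.0 + c * eps) ** 3

def reparameterized_rejection_sample(a, rng=None):
    """Algorithm 1: return eps such that h(eps, a) ~ Gamma(a, 1)."""
    if rng is None:
        rng = np.random.default_rng()
    d = a - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    while True:
        eps = rng.standard_normal()   # propose eps_i ~ s(eps) = N(0, 1)
        v = (1.0 + c * eps) ** 3
        if v <= 0:                    # h(eps, a) outside the support: reject
            continue
        u = rng.uniform()             # simulate u_i ~ U[0, 1]
        # Accept iff u < q(h(eps); a) / (M_a r(h(eps); a)); for this proposal
        # the log of that ratio simplifies to the Marsaglia-Tsang test below.
        if np.log(u) < 0.5 * eps**2 + d - d * v + d * np.log(v):
            return eps

eps = reparameterized_rejection_sample(a=2.5)
z = h(eps, 2.5)   # z ~ Gamma(2.5, 1), a differentiable function of the shape
```

Because the returned ε has the post-acceptance distribution, h(ε, a) is a draw from the target while remaining differentiable in a, which is what the reparameterization gradient requires.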

Similar resources

Variational Rejection Sampling

Learning latent variable models with stochastic variational inference is challenging when the approximate posterior is far from the true posterior, due to high variance in the gradient estimates. We propose a novel rejection sampling step that discards samples from the variational posterior which are assigned low likelihoods by the model. Our approach provides an arbitrarily accurate approximat...

Reparameterization Gradients through Acceptance-Rejection Sampling Algorithms

Variational inference using the reparameterization trick has enabled large-scale approximate Bayesian inference in complex probabilistic models, leveraging stochastic optimization to sidestep intractable expectations. The reparameterization trick is applicable when we can simulate a random variable by applying a differentiable deterministic function on an auxiliary random variable whose distrib...

Simulation I

We have spent considerable effort in this class approximating intractable probability distributions. These approximations have several uses: they allow us to give a compact (but fairly accurate) representation for a complicated data set, or allow us to perform prediction or inference that would otherwise be impossible. The last few lectures have focused on deterministic methods for approximatin...

Hybrid Variational/Gibbs Collapsed Inference in Topic Models

Variational Bayesian inference and (collapsed) Gibbs sampling are the two important classes of inference algorithms for Bayesian networks. Both have their advantages and disadvantages: collapsed Gibbs sampling is unbiased but is also inefficient for large count values and requires averaging over many samples to reduce variance. On the other hand, variational Bayesian inference is efficient and ...

Variational probabilistic inference and the QMR-DT database

We describe a variational approximation method for efficient inference in large-scale probabilistic models. Variational methods are deterministic procedures that provide approximations to marginal and conditional probabilities of interest. They provide alternatives to approximate inference methods based on stochastic sampling or search. We describe a variational approach to the problem of diagnos...

Journal title: —

Volume: — Issue: —

Pages: —

Publication date: 2016

Venue: Advances in Approximate Bayesian Inference (NIPS 2016 Workshop), Barcelona, Spain.